Development of Speech Input Method for Interactive VoiceWeb Systems

Authors

  • Ryuichi Nisimura
  • Jumpei Miyake
  • Hideki Kawahara
  • Toshio Irino
Abstract

We have developed a speech input method called “w3voice” for building practical, easy-to-use voice-enabled Web applications. It is built from a simple Java applet and CGI programs composed of free software. On our website (http://w3voice.jp/), we have released automatic speech recognition and spoken dialogue applications suitable for practical use. Voice-based interaction works by transmitting raw audio signals with the HTTP POST method and answering with an HTTP redirection response. The system also aims to build a voice database collected from home and office environments over the Internet, with the purpose of observing actual human-machine and human-human voice interactions. Over a period of seven months, we acquired 8,412 inputs (47.9 inputs per day) captured with ordinary PCs. Experiments with trial users confirmed the user-friendliness of our system in human-machine dialogues.
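The POST-then-redirect interaction mentioned in the abstract can be illustrated with a minimal sketch. This is not the w3voice implementation, which according to the abstract uses a Java applet and CGI programs. Python's WSGI calling convention stands in for CGI here, and the `/upload` and `/result/<id>` paths, the in-memory `RECORDINGS` store, the `call` helper, and the choice of status 303 are all assumptions made for illustration (the abstract says only "redirection response").

```python
import io
import uuid

# Hypothetical in-memory store; a real server would persist audio to disk.
RECORDINGS = {}


def app(environ, start_response):
    """Accept raw audio via POST, then redirect the client to a result page."""
    method = environ.get("REQUEST_METHOD", "GET")
    path = environ.get("PATH_INFO", "/")

    if method == "POST" and path == "/upload":
        # The request body is the raw audio signal, read as plain bytes.
        length = int(environ.get("CONTENT_LENGTH") or 0)
        audio = environ["wsgi.input"].read(length)
        rec_id = uuid.uuid4().hex
        RECORDINGS[rec_id] = audio
        # Assumed status: 303 tells the client to fetch the result with GET.
        start_response("303 See Other", [("Location", f"/result/{rec_id}")])
        return [b""]

    if method == "GET" and path.startswith("/result/"):
        audio = RECORDINGS.get(path[len("/result/"):])
        if audio is not None:
            # Placeholder for recognition or dialogue output.
            body = f"received {len(audio)} bytes".encode()
            start_response("200 OK", [("Content-Type", "text/plain"),
                                      ("Content-Length", str(len(body)))])
            return [body]

    start_response("404 Not Found", [("Content-Type", "text/plain")])
    return [b"not found"]


def call(method, path, body=b""):
    """Invoke the app directly, without opening a network socket."""
    captured = {}

    def start_response(status, headers):
        captured["status"] = status
        captured["headers"] = dict(headers)

    environ = {
        "REQUEST_METHOD": method,
        "PATH_INFO": path,
        "CONTENT_LENGTH": str(len(body)),
        "wsgi.input": io.BytesIO(body),
    }
    chunks = app(environ, start_response)
    return captured["status"], captured["headers"], b"".join(chunks)
```

In this sketch, a client would send the recorded samples with `call("POST", "/upload", samples)` and then follow the returned `Location` header with a GET request. The redirect lets an ordinary browser display the result page after the upload finishes, which is one plausible reason for pairing POST with a redirection response.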

Related articles

An Information-Theoretic Discussion of Convolutional Bottleneck Features for Robust Speech Recognition

Convolutional Neural Networks (CNNs) have demonstrated their performance in speech recognition systems, both for feature extraction and for acoustic modeling. In addition, CNNs have been used for robust speech recognition, and competitive results have been reported. A Convolutive Bottleneck Network (CBN) is a kind of CNN that has a bottleneck layer among its fully connected layers. The bottleneck fea...

A Database for Automatic Persian Speech Emotion Recognition: Collection, Processing and Evaluation

Recent developments in robotics automation have motivated researchers to improve the efficiency of interactive systems by enabling natural man-machine interaction. Since speech is the most popular method of communication, recognizing human emotions from the speech signal has become a challenging research topic known as Speech Emotion Recognition (SER). In this study, we propose a Persian em...

A New Method for Robust Speech Recognition Based on Missing Data Using a Bidirectional Neural Network

The performance of speech recognition systems is greatly reduced when speech is corrupted by noise. One common approach to robust speech recognition is the missing-feature method. In this approach, the components of the time-frequency representation of the signal (the spectrogram) that have a low signal-to-noise ratio (SNR) are tagged as missing, deleted, and then replaced using the remaining components and statistical ...

The Effect of Explicit and Implicit Instruction through Plays on EFL Learners’ Speech Act Production

Despite general findings that teaching pragmatic features contributes positively to interlanguage pragmatic development, the question of which method is most effective is far from resolved. Moreover, the potential of literature as a means of introducing learners to the social practices and norms of the target culture, which underlie the pragmatic competence, has not been...

The Effect of Comprehensible Input and Comprehensible Output on the Accuracy and Complexity of Iranian EFL Learners’ Oral Speech

This study investigated the relative impact of comprehensible input and comprehensible output on the development of grammatical accuracy and syntactic complexity in Iranian EFL learners’ oral production. Participants were 60 female EFL learners selected from a pool of 80 on the basis of a standard IELTS test. To investigate the research questions, the participants were ...

Publication date: 2009